Module 01

Data science Friday

The remaining second level headers (##) are for separating data science Friday, regular course, and project content. In this module, you will only need to include data science Friday and regular course content; projects will come later in the course.

Installation check

Third level headers (###) should be used for links to assignments, evidence worksheets, problem sets, and readings, as seen here.

Use this space to include your installation screenshots.
Windows GitBash Terminal

RStudio

GitHub homepage

Portfolio repo setup

Detail the code you used to create, initialize, and push your portfolio repo to GitHub. This will be helpful as you will need to repeat many of these steps to update your portfolio throughout the course.

Starting after registering for a new GitHub account:
1. git init
2. git add .
3. git commit -m "First commit"
4. git remote add origin https://github.com/judyban/MICB425_portfolio
5. git remote -v
6. git push -u origin master

Repeat steps 2 and 3, then run git push, to push new material to the repository.
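
The repeated update (steps 2 and 3 followed by git push) could be wrapped in a small shell function. This is only a sketch: the function name `update_portfolio`, the default commit message, and the `GIT` override variable are hypothetical conveniences, not part of the course instructions. It assumes the upstream branch was already set by step 6.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original steps): stage, commit, and push
# in one call. GIT defaults to the real git; setting GIT=echo gives a dry run
# that only prints the arguments each git call would receive.
update_portfolio() {
    msg="${1:-Update portfolio}"   # commit message; the default text is an assumption
    "${GIT:-git}" add . &&
    "${GIT:-git}" commit -m "$msg" &&
    "${GIT:-git}" push             # relies on the upstream set by `git push -u origin master`
}
```

From inside the portfolio folder, a call might look like `update_portfolio "Add module 01 worksheets"`. Chaining with `&&` stops the sequence if staging or committing fails, so nothing is pushed after an error.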

RMarkdown pretty html challenge

The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.
http://phdcomics.com/ Comic posted 1-17-2018

Challenge Goals

The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as close to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)

hint: go to the PhD Comics website to see if you can find the image above
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown

Here’s a header!

Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (you can most easily tell this from the table of contents).

Another header, now with maths

Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:

1231521+12341556280987
## [1] 1.234156e+13

Table Time

Or maybe, after you’ve added those numbers, you feel like it’s about time for a table! I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (more on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.

library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
I made this table with kable in the knitr package library
| speed | dist |
|---|---|
| Min. : 4.0 | Min. : 2.00 |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4 | Mean : 42.98 |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0 | Max. :120.00 |

And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun gif of your choice!

Origins and Earth Systems

Evidence worksheet 01

Whitman et al 1998

Learning objectives

Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.

General questions

  • What were the main questions being asked?
    • What is the estimated abundance of prokaryotes in various reservoirs on Earth?
    • What is the estimated content of the primary nutrients in the prokaryotes of these reservoirs?
  • What were the primary methodological approaches used?
    • Data were pooled from the literature, and extrapolations were made from averages of prokaryote characteristics, such as cell volume and density, and of environmental characteristics, such as porosity. In addition, carbon content was estimated from the average carbon content per prokaryotic cell.
    • In detail, the methodological approaches used were:
      1. Average prokaryote abundances in unconsolidated sediments were calculated down to 600 m; for 600 m to 4 km, the number was extrapolated using the formula of Parkes et al.
      2. Extrapolations were based on the assumption that the average porosity of the terrestrial subsurface is 3% and that prokaryotes occupy 0.015% of the pore space. Volume of pore space occupied by prokaryotes / volume of a single prokaryotic cell = total number of prokaryotes.
      3. Carbon content was estimated from the average carbon content per prokaryotic cell.
  • Summarize the main results or findings.
    • Earth’s prokaryotes contain 10-fold more nutrients than plants and represent the largest pool of these nutrients in living organisms. The total amount of prokaryotic carbon is 60-100% of the estimated total carbon in plants. The largest reservoirs of prokaryotes are the ocean, soil, and the oceanic and terrestrial subsurfaces, and the ocean has the highest turnover of prokaryotes. The large population size and rapid growth of prokaryotes allow mutations to accumulate, which leads to a large capacity for genetic diversity.
  • Do new questions arise from the results?
    • The data carry considerable uncertainty arising from the methods used, for example in the estimate of prokaryote abundance in groundwater.
      These uncertainties raise further questions, such as how prokaryotic turnover affects nutrient cycles. The genetic diversity of prokaryotes is vast and poorly understood, and the authors are also uncertain how all the metabolic pathways of microbes fit into the currently recognized nutrient cycles.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • Challenges:
      • The authors did not provide enough background on prokaryote abundance on Earth for readers new to the topic, but jumped directly into the numbers of organisms in each habitat.
      • Minimal explanation of calculation and methods.
    • Assumptions:
      • The authors acknowledged that their data were extrapolated from the literature, but did not critically assess the validity of the values taken from it.
      • The authors were also aware of areas of conflict between data from different sources.
    • Conclusions:
      • The results were impactful, and evidence from a variety of primary literature was presented to support the conclusions.
      • The figures and tables were straightforward and easy to understand.
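
The terrestrial-subsurface extrapolation described under the methodological approaches can be written compactly. The symbols are labels introduced here for illustration, not notation from the paper: \(V\) is the volume of terrestrial subsurface considered, \(\phi\) the average porosity (3%), \(f\) the fraction of pore space occupied by prokaryotes (0.015% in the notes above), and \(v_{\text{cell}}\) the average volume of one prokaryotic cell.

\[
N_{\text{cells}} = \frac{V \times \phi \times f}{v_{\text{cell}}}, \qquad \phi = 0.03, \quad f = 1.5\times10^{-4}
\]

The numerator is the pore volume actually filled by cells, so dividing by the volume of a single cell gives a cell count.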

Problem set 01

Whitman et al 1998

Learning objectives:

Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.

Specific questions:

  • What are the primary prokaryotic habitats on Earth and how do they vary with respect to their capacity to support life? Provide a breakdown of total cell abundance for each primary habitat from the tables provided in the text.
    | Primary habitat | Total cell abundance |
    |---|---|
    | Open ocean | \(1.2\times10^{29}\) |
    | Soil | \(2.6\times10^{29}\) |
    | Subsurface | \(3.8\times10^{30}\) |
    | Oceanic subsurface | \(3.5\times10^{30}\) |
    | Terrestrial subsurface | \(0.25\times10^{30}\) to \(2.3\times10^{30}\) |
  • What is the estimated prokaryotic cell abundance in the upper 200 m of the ocean and what fraction of this biomass is represented by marine cyanobacterium including Prochlorococcus? What is the significance of this ratio with respect to carbon cycling in the ocean and the atmospheric composition of the Earth?

    • The estimated prokaryotic cell abundance in the upper 200 m of the ocean is \(3.6\times10^{28}\) cells, at a cell density of \(5\times10^{5}\) cells/mL. Cyanobacteria have a density of \(4\times10^{4}\) cells/mL, so marine cyanobacteria, including Prochlorococcus, represent \(4\times10^{4} / 5\times10^{5} = 8\%\) of total prokaryotic cell abundance. Since they are autotrophs, they are the major players in driving one part of the carbon cycle by assimilating inorganic carbon into organic carbon through photosynthesis. They also drive the food web by sitting at the bottom of it and serving as primary producers.
  • What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?

    • Autotroph – “self-nourishing” – fixes inorganic carbon (CO2) into biomass, producing complex organic carbon that serves as a carbon source for other organisms.

    • Heterotroph – assimilates organic carbon, originally fixed by autotrophs, as its carbon source.

    • Lithotroph – obtains electrons (reducing power) from inorganic chemicals; that is, it uses inorganic substrates as electron donors.

  • Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?

    • The deepest habitat is the oceanic subsurface, which extends up to 4 km below ground. With increasing depth, high temperature becomes the primary limiting factor for prokaryotic life, with an upper limit of about 125 °C. Temperature increases by about 22 °C/km.
  • Based on information provided in the text and your knowledge of geography, what is the highest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this height?

    • Prokaryotes have been found as high as 77 km, but most bacteria found there were transported there rather than being native inhabitants. A realistic upper limit for habitats is around 20 km. I predict that most of them would be spore-forming bacteria, given the desiccation and limited nutrients. While the paper did not provide a clear answer on the limiting factor, the thinning of atmospheric gases at high altitudes limits the nutrients available for prokaryote uptake. In addition to the lack of moisture in the upper atmosphere, which could lead to desiccation, UV radiation is also a strong factor preventing life at that altitude.
  • Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?

    • 20km (atmospheric) + 4km (subsurface) = 24km
  • How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)

    • (Population size) x (number of turnovers per year) = cells/year
    • Example, marine heterotrophs:
      • \(3.6\times10^{28}\ \text{cells} \times \frac{365\ \text{days/year}}{16\ \text{days/turnover}} = 8.2\times10^{29}\ \frac{\text{cells}}{\text{year}}\)
  • What is the relationship between carbon content, carbon assimilation efficiency and turnover rates in the upper 200m of the ocean? Why does this vary with depth in the ocean and between terrestrial and marine habitats?

    • The equations were typed in Excel due to lack of experience with LaTeX. The screenshot of the equations is pasted here.
  • How were the frequency numbers for four simultaneous mutations in shared genes determined for marine heterotrophs and marine autotrophs given an average mutation rate of \(4\times10^{-7}\) per DNA replication? (Provide an example of the calculation with units. Hint: cell and generation cancel out)

    • \(\left(4\times10^{-7}\right)^{4} \times 8.2\times10^{29} = 2.1\times10^{4}\ \frac{\text{mutations}}{\text{year}}\)
  • Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?

    • Large population size and high mutation rate are a major source of genetic diversity and among the essential factors that allow prokaryotes first to adapt, then to evolve in different environments. While point mutations are common, they are not the only way to adapt. Other mechanisms include horizontal transfer of genetic material between different prokaryotes.
  • What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?
    • Prokaryote abundance creates an opportunity for frequent mutations and exchanges of genetic material, as seen in the upper 200 m of the ocean, where genetic diversity is higher than in domestic animals. The high diversity allows prokaryotes to adapt to different habitats, which is demonstrated by open ocean prokaryotes having higher adaptive potential than those in soil, the subsurface and domestic animals. Adaptation to unique environments will eventually lead to divergence in metabolic potential and ultimately to evolution and diversity.
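
The annual production and four-mutation calculations in this problem set can be chained into one worked derivation for marine heterotrophs. It uses only the numbers quoted in the answers (a population of \(3.6\times10^{28}\) cells, a 16-day turnover time, and a mutation rate of \(4\times10^{-7}\) per DNA replication). The last line, the average waiting time between events, is an extra step added here for illustration and assumes 8760 hours in a year.

\[
\begin{aligned}
\text{turnovers per year} &= \frac{365\ \text{days/year}}{16\ \text{days/turnover}} \approx 22.8 \\
\text{cells per year} &= 3.6\times10^{28} \times 22.8 \approx 8.2\times10^{29} \\
\left(4\times10^{-7}\right)^{4} &= 2.56\times10^{-26} \\
2.56\times10^{-26} \times 8.2\times10^{29} &\approx 2.1\times10^{4}\ \text{events per year} \\
\frac{8760\ \text{hours/year}}{2.1\times10^{4}\ \text{events/year}} &\approx 0.42\ \text{hours between events}
\end{aligned}
\]

Raising the rate to the fourth power reflects four independent mutations occurring in the same cell, which is why the per-cell probability is so small and why only a very large annual cell production makes the event frequent.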

Evidence Worksheet 02

Nisbet and Sleep 2001

Learning objectives:

Comment on the emergence of microbial life and the evolution of Earth systems

  • Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.

    • 4.6 billion years ago
      • Hadean Eon
      • formation of the solar system
      • Inner planets received water vapour and carbon
      • heavy meteorite bombardment
    • 4.5 billion years ago
      • the Moon formed, giving Earth its spin and tilt, day and night cycles, and seasons
    • 4.4 billion years ago
      • formation of Zircon (oldest mineral)
    • 4.2 billion years ago
      • continued heavy meteorite bombardment; Earth was unlikely to be permanently habitable before this time
      • earliest proposed evidence of life, in sedimentary rock from Quebec (reported in 2017)
    • 4.1 billion years ago
      • evidence of life present in Zircon
    • 4.0 billion years ago
      • beginning of Archean Eon
      • oldest rock, Acasta gneiss
      • evidence of plate subduction
    • 3.8 billion years ago
      • meteorite bombardment halted
      • condensation of atmospheric water into oceans
      • speculated to be the lower boundary of the beginning of life
      • oldest known water-lain sedimentary rock found
    • 3.5 billion years ago
      • evidence of photosynthesis from stromatolites
      • evidence of a Rubisco signature indicates global oxygenic photosynthesis and the evolution of cyanobacteria
      • sulphate detected in rocks, signifies localized non-reducing conditions
    • 3.0 billion years ago
      • well-developed stromatolites
      • global glaciation
    • 2.7 billion years ago
      • ancestral eukaryotes appeared
    • 2.5 billion years ago
      • Beginning of Proterozoic Eon
    • 2.2 billion years ago
      • Great Oxidation Event, sharp increase of atmospheric oxygen
    • 1.7 billion years ago
      • evolution of multicellular eukaryotes
    • 1.3 billion years ago
      • evidence for evolution of land fungi
    • 540 million years ago
      • beginning of Phanerozoic Eon
      • Cambrian explosion
    • 480 million years ago
      • Devonian explosion
      • land plants appear
    • 200,000 years ago
      • first appearance of Homo sapiens
  • Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:

    • Hadean
      • molten Earth due to extreme volcanism and frequent collisions
      • atmosphere consisted of abundant water vapour, abundant carbon dioxide, and nitrogen
      • early oceans formed from water vapour
      • the Moon-forming impact produced a rock-vapour atmosphere
      • a range of atmospheric temperatures, from a 100 °C CO2 greenhouse to a glacial “Ice-Hades”, with intervals of warm atmosphere after major impacts
      • biochemicals that are the prerequisites for the origin of life were present in Late Hadean
    • Archean
      • temperatures similar to modern day despite the faint young sun
      • Presence of reducing atmosphere: H2O, CH4, H2 and NH3
      • intense UV without ozone
    • Precambrian
      • glacial Earth known as Snowball Earth
      • atmosphere hypothesized to be composed primarily of nitrogen, CH2 and other inert gases
      • some oxygen present but not in significant amounts
    • Proterozoic
      • accumulation of oxygen in atmosphere
      • occurrence of first known glaciations
    • Phanerozoic
      • abundant animal and plant life
      • normal amount of atmospheric oxygen comparable to today

Problem set 02

Falkowski et al 2008

Learning objectives:

Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.

Specific Questions:

  • What are the primary geophysical and biogeochemical processes that create and sustain conditions for life on Earth? How do abiotic versus biotic processes vary with respect to matter and energy transformation and how are they interconnected?

    • The geophysical processes are tectonics and atmospheric photochemistry, which continuously supply substrates and remove products to create geochemical cycles. The geological supply of C, S and P depends on tectonic processes such as volcanism and rock weathering. Of the six major elements of life (H, C, N, O, S and P), the biological fluxes of the first five are largely driven by microbially catalyzed redox reactions that sustain life on Earth.
    • Abiotic processes are mostly driven by acid/base chemistry (transfer of protons without electrons), whereas biotic processes are driven by redox reactions (successive transfers of electrons and protons among a relatively limited set of chemical elements). On Earth in particular, biological oxidation is driven by photosynthesis.
  • Why is Earth’s redox state considered an emergent property?

    • Biotic and abiotic processes together altered the surface redox state of the planet. It is the feedbacks between metabolic and geological processes that create the average redox condition of the oceans and atmosphere.
  • How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?
    • At the community scale, as with methanogens, a redox process can require the cooperation of multiple species. For example, methanogenic archaea produce methane by reducing CO2 with H2. Hydrogen-consuming sulfate reducers in the vicinity lower the hydrogen concentration, which makes the reverse process thermodynamically favorable. The methane can then be oxidized by other species to release H2. This contributes to the biogeochemical cycles of atmospheric trace gases. At the cellular scale, some pathways are reversible, such as the TCA cycle. Some archaea can run the TCA cycle oxidatively, converting organic carbon into CO2 to release energy, or reductively, assimilating CO2 into organic matter at the expense of energy. The energy produced by the oxidative process can feed into the reductive process. In addition, this reversibility contributes to maintaining balance in the carbon cycle on a global scale.
  • Using information provided in the text, describe how the nitrogen cycle partitions between different redox niches and microbial groups. Is there a relationship between the nitrogen cycle and climate change?
    • The niches are divided in part because organisms occupy aerobic versus anaerobic environments. For example, the enzyme nitrogenase, which catalyzes nitrogen fixation, is inhibited by oxygen, so that step can only be performed where oxygen is excluded. Meanwhile, nitrification is carried out by aerobic bacteria.
    • Nitrogenous gases also play an important role in global climate change. Nitrous oxide is a particularly potent greenhouse gas as it is over 300 times more effective at trapping heat in the atmosphere than carbon dioxide. Nitrogen from fertiliser, effluent from livestock and human sewage also boost the growth of algae and cause water pollution.
  • What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?

    • Metabolic diversity gives rise to various functional phenotypes that confer fitness on groups of organisms and help them adapt to new environments, thereby creating microbial diversity. Metabolic diversity can be acquired through vertical inheritance or horizontal gene transfer (HGT), the two principal modes of evolution. In HGT, metabolic genes, or sometimes even an entire metabolic apparatus (e.g. the photosystems), can be transferred to various groups. The genes responsible for major metabolic processes, or entire metabolic pathways, that have persisted were likely distributed in a common gene pool before further differentiation of species. Afterwards, nutritional and bioenergetic selective pressures drove the retention of these horizontally transferred genes.
  • On what basis do the authors consider microbes the guardians of metabolism?

    • Microbes can maintain genes for metabolism through HGT. Even if one community becomes extinct, the wide-spread distribution of these genes across many communities and niches will allow the metabolic pathways to be preserved.

Evidence Worksheet 03

Rockstrom et al 2009

Learning objectives:

Evaluate human impacts on the ecology and biogeochemistry of Earth systems.

General questions

  • What were the main questions being asked?

    • What are the key variables, that is, the planet’s biophysical subsystems or processes, that are negatively affected by human activity?
    • How can we define the parameters and thresholds for each planetary boundary so that they act as guidelines for a safe operating space for humanity?
    • How far has humanity pushed each of the planetary boundaries and what are the consequences?
  • What were the primary methodological approaches used?
    • Planetary boundaries are defined by values of control variables that are set either at a “safe” distance from a threshold, for processes with evidence of threshold behavior, or at dangerous levels, for processes without a well-defined threshold. The large uncertainties surrounding the true thresholds are taken into consideration. A safe distance is determined by normative judgements of how societies choose to deal with risk and uncertainty.
  • Summarize the main results or findings.
    • The nine planetary boundaries defined were: atmospheric aerosol loading, chemical pollution, climate change, ocean acidification, stratospheric ozone depletion, nitrogen and phosphorus cycle, global freshwater use, change in land use, biodiversity loss.
    • Three of the Earth-system processes have already transgressed their boundaries, and their continued deterioration will significantly impact the resilience of major components of Earth-system functioning. They are:
      • Climate change
      • Biodiversity loss
      • Interference with the nitrogen cycle
    • Climate change: the current CO2 concentration has transgressed the boundary due to underestimation of the long-term effects of greenhouse gases and may lead to long-term irreversible climate change such as loss of major ice sheets, accelerated sea level rise and shifts in biological systems
    • Rate of biodiversity loss: the Anthropocene has accelerated the rate of species extinction to 100 to 1,000 times the natural background rate, mainly due to habitat loss from changes in land use and the speed of climate change
    • Nitrogen and phosphorus cycles: additional input of nitrogen and phosphorus from large-scale human production has begun to significantly disturb their global cycles
  • Do new questions arise from the results?
    • The parameters and boundaries for biodiversity loss and for human modification of the nitrogen cycle are vague; more research is required to pin down these boundaries with greater certainty
    • Better understanding of essential Earth processes and human actions is needed in order to advance global change research and sustainability science
    • Exploration into the complex dynamic interactions and self-regulation of living systems is also required to better appreciate thresholds and shifts between states. This will help us realize the severity of environmental conditions.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • The paper is written in a way that is easy to understand, and the figures are well presented and explained in relation to the analysis. Although the methods for defining the boundaries and thresholds were explained inadequately for biodiversity and the nitrogen cycle, the authors did emphasize the uncertainty in the boundary definitions used in the paper. The authors made assumptions when speculating about the long-term consequences of transgressing the boundary thresholds. However, I believe the conclusions are largely justified, as the paper gathered quantitative evidence from a variety of sources detailing how human actions have altered the Earth’s natural cycles.

Writing assessment 01

“Microbial life can easily live without us; we, however, cannot survive without the global catalysis and environmental transformations it provides.” Do you agree or disagree with this statement? Answer the question using specific reference to your reading, discussions and content from evidence worksheets and problem sets

For billions of years, microbial life drove the establishment of Earth’s biogeochemical cycles and coevolved with ancestors that ultimately led to the evolution of Homo sapiens. But does the future of our planet lie solely in the hands of humans, while microbes are merely a remnant of the past? Humans have been reliant on microbes and will continue to depend on them to ensure the survival of our species, in the following three ways. Firstly, microbes, as an essential part of Earth’s biogeochemical cycles, made the Earth suitable for human survival. Secondly, human industrialization has rapidly expanded technologies that can rival certain microbial processes on a large scale, but in an unbalanced manner that has disrupted biogeochemical cycles. Lastly, humans must eventually rely on microbial metabolism to re-establish a balanced biogeochemical system. Humans live in a world that would not have existed without microbes, but microbes can easily continue to thrive without humans.

Microbial influence on altering Earth’s atmosphere and biogeochemical landscapes

Microbial metabolism has driven the development of Earth’s atmosphere and geochemical landscape into one in which humans can survive. The earliest evidence of microbial activity dates to at least 3.5 billion years ago (Gya) (1) and is derived from stable carbon-isotope and sulfur-isotope fractionation that suggests possible sulfate reduction and methanogenesis (2). Along with volcanic release of mantle gases, these methanogens contributed in part to an anoxic early Earth atmosphere rich in carbon dioxide and methane (2). The greenhouse effect of atmospheric methane is speculated to have prevented a glacial Earth under the faint young sun, which delivered roughly one third less solar radiation than the modern sun (3). Even in our modern atmosphere, microorganisms such as methanogens continue to contribute to the trace gases, whose sources are almost entirely biological (4). Approximately 2.45 Gya, oxygenic photosynthesis performed by cyanobacteria led to an initial rise in atmospheric oxygen. Specifically, oxygenic phototrophy produced oxygen while the organic carbon it fixed was buried in sediments; with that carbon removed from the marine carbon cycle, the oxygen was not consumed again and remained in the atmosphere (4). The oxygen these microbes released as a by-product rose rapidly in the atmosphere and led to the Great Oxidation Event (1). The reactive oxygen species in the newly oxygenated atmosphere were toxic to most anaerobes and resulted in mass extinctions in anaerobic niches (3). This selective pressure aided the widespread establishment of aerobic microorganisms and ultimately paved the way for the evolution of the complex organisms alive today. The microbial catalysis of geochemical processes as a by-product of metabolism has played an important role in shaping the environment in which we currently live.

Industrial replacement of microbial processes leads to disruption of biogeochemical cycles

The rapidly increasing human population and technological leaps have triggered a new Anthropocene landscape that requires constant management to maintain ecological balance, a task we have a difficult time keeping up with. An example of this is the Haber-Bosch process for industrial nitrogen fixation. With it, humans can substitute for microbial metabolism, fixing nitrogen at twice the scale of the natural nitrogen cycle (5). However, the Haber-Bosch process has resulted in a disproportionate input of nitrogen that not only impacts the nitrogen cycle but also affects other natural cycles of chemical elements. The distortion of basic biogeochemical cycles across the globe is a result of the decoupling of the intimate interactions between the natural cycles (6). For example, the large input of industrially synthesized ammonia in fertilizers can be converted by nitrifying bacteria into highly mobile nitrate, which leaches into aquatic ecosystems. The overabundance of nitrogen in these ecosystems feeds forward into the carbon cycle by increasing organic carbon, whose decomposition removes oxygen (7). This results in hypoxic zones around the world and a huge loss of aquatic biodiversity (7). Ultimately, disruption of the interdependence of natural cycles will require constant human intervention to maintain biogeochemical balance (6). The advent of industrial processes that carry out microbial metabolism has disrupted natural biogeochemical cycles, and humans, acting as interventionists, are limited in their ability to dictate the future of our planet without microbes.

The role of microbes in biogeochemical cycles as a solution to environmental disruptions

The industrial processes implemented to supplement microbial metabolism have wrecked havoc on the natural biogeochemical cycles, and humans must turn to microbes to re-establish a balanced biogeochemical system. Despite earlier mentions of human industrial processes, microbes today continue to support human sustainability through reduction-oxidation (redox) reactions. They work with abiotic processes to maintain the balance of the six major bio-elements, hydrogen (H), carbon (C), nitrogen (N), oxygen (O), sulfur (S) and phosphorus (P) (2). Of all the biogeochemical cycles, the nitrogen cycle is the only one that is entirely biologically driven (2). The biological N cycle is balanced by five different reactions - nitrogen fixation, nitrification, anammox, denitrification and ammonification, which are all maintained by the synergistic cooperation of multiple microbes (8). Without human intervention, nitrogen fixation is the only biological process that converts N2 into ammonium (NH4+), which is carried out by certain microbes expressing enzyme nitrogenase. In agriculture, legume plants form symbiotic associations with some nitrogen-fixing species of Rhizobium, for accessible forms of N used in proteins and nucleic acids synthesis (8). NH4+ can be oxidized back into nitrate in a two-step nitrification pathway, then denitrification converts the product into N2 to complete the nitrogen cycle (8). As nitrogenase is highly sensitive to oxygen, each step of the cycle is catalyzed by specific group of bacteria in the presence of oxygen, and they work synergistically to maintain balance in the input and output of global nitrogen resources (2). Modern agriculture is a major case of environmental pollution, there is a need for more sustainable solutions to nitrogen fertilizer use. Humans have used techniques such as intercropping and crop rotation to help naturally increase nitrogen content of the soil, but these methods are time consuming (8). 
Scientists are now looking at biological engineering to introduce nitrogen-fixing microbial metabolism to non-legume crops in order to reduce the requirement for exogenous fertilization (9). Currently there are two strategic approaches in non-legume plants: investigation of non-legume nitrogen-fixing bacteria, and genetic engineering of signalling pathways to create a rhizobium-friendly environment in plant nodules (9). With biological nitrogen sources, microbes can balance the natural feedback system in a more constrained manner and produce a new steady state of the N cycle over time (3). Thus, microbes have not only improved human quality of life; they may also be the solution that ensures planetary and human sustainability.

Conclusion

Microbes play an essential role in driving the biogeochemical cycles that ultimately produced an environment suitable for human survival in the past, present and future. The global catalysis resulting from photosynthetic metabolic by-products helped establish an oxygenated environment appropriate for human evolution. However, humans’ lack of understanding of the consequences of our current actions has led to imbalances in biogeochemical cycles. With a better understanding of microbial metabolism, these pathways may be used as renewable sources of materials, and have the potential to save the Earth from the destruction created by our ignorance. In this sense, humans must continue to explore microbial metabolic diversity until we have a complete picture of it, to ensure self-sustainability in the long run.

Writing Assignment 01 references

  1. Nisbet, EG, Sleep, NH. 2001. The habitat and nature of early life. Nature. 409:1083-1091. PMID11234022
  2. Falkowski, PG, Fenchel, T, Delong, EF. 2008. The Microbial Engines That Drive Earth’s Biogeochemical Cycles. Science. 320:1034-1039. PMID18497287
  3. Sessions, AL, Doughty, DM, Welander, PV, Summons, RE, Newman, DK. 2009. The Continuing Puzzle of the Great Oxidation Event. Current Biology. 19:R574. PMID119640495
  4. Kasting, J, Siefert, J. 2003. Life and the evolution of Earth’s atmosphere. Science. 299:1015-1015. PMID112655895
  5. Gilbert, JA, Neufeld, JD. 2014. Life in a World without Microbes. PLoS Biol. 12:12. PMC4267716
  6. Falkowski, PG. 2015. Life’s Engines: How Microbes Made Earth Habitable. Princeton University Press, Princeton, NJ, U.S.A.
  7. Canfield, DE, Glazer, AN, Falkowski, PG. 2010. The evolution and future of Earth’s nitrogen cycle. Science. 330:192-196. PMID20929768
  8. Bernhard, A. 2010. The nitrogen cycle: processes, players, and human impact. Nature Education Knowledge. 3:25. Nature Education
  9. Dent, D, Cocking, E. 2017. Establishing symbiotic nitrogen fixation in cereals and other non-legume crops: The Greener Nitrogen Revolution. Agriculture & Food Security. 6:7. https://doi.org/10.1186/s40066-016-0084-2

Module 01 references

Achenbach, J. 2012. Spaceship Earth: A new view of environmentalism. Washington Post (January 2, 2012).Washington Post

Bernhard, A. 2010. The nitrogen cycle: processes, players, and human impact. Nature Education Knowledge. 3:25. Nature Education

Canfield, DE, Glazer, AN, Falkowski, PG. 2010. The evolution and future of Earth’s nitrogen cycle. Science. 330:192-196.PMID20929768

Dent, D, Cocking, E. 2017. Establishing symbiotic nitrogen fixation in cereals and other non-legume crops: The Greener Nitrogen Revolution. Agriculture & Food Security. 6:7.https://doi.org/10.1186/s40066-016-0084-2

Falkowski, PG, Fenchel, T, Delong, EF. 2008. The microbial engines that drive Earth’s biogeochemical cycles. Science. 320:1034-1039. PMID18497287

Falkowski, P, Scholes, RJ, Boyle, E, Canadell, J, Canfield, D, Elser, J, Gruber, N, Hibbard, K, Hugberg, P, Linder, S. 2000. The global carbon cycle: a test of our knowledge of earth as a system. Science. 290:291-296.PMID11030643

Fuerst, JA, Sagulenko, E. 2011. Beyond the bacterium: planctomycetes challenge our concepts of microbial structure and function. Nature Reviews Microbiology.9:403.PMID20929768

Gilbert, JA, Neufeld, JD. 2014. Life in a World without Microbes. PLoS Biol. 12:12. PMC4267716

Kallmeyer, J, Pockalny, R, Adhikari, RR, Smith, DC, D’Hondt, S. 2012. Global distribution of microbial abundance and biomass in subseafloor sediment. Proceedings of the National Academy of Sciences. 109:16213-16216.PMID22927371

Kasting, JF, Siefert, JL. 2002. Life and the evolution of Earth’s atmosphere. Science. 296:1066-1068.PMID12004117

Leopold, A. 2014. The land ethic, p. 108-121. In Anonymous The Ecological Design and Planning Reader. Springer.Springer

Mooney, C. 2016. Scientists say humans have now brought on an entirely new geologic epoch. The Washington Post. Article

Nisbet, EG, Sleep, NH. 2001. The habitat and nature of early life. Nature. 409(6823):1083-1091. PMID11234022

Rockstrom J, Steffen W, Noone K, Persson A, Stuart Chapin II F, Lambin EF, Lenton TM …Foley JA. 2009. A safe operating space for humanity. Nature. 461:472-475. PMID19779433

Schrag, DP. 2012. Geobiology of the Anthropocene. Fundamentals of Geobiology. 425-436.https://doi.org/10.1002/9781118280874.ch22

Sessions, AL, Doughty, DM, Welander, PV, Summons, RE, Newman, DK. 2009. The Continuing Puzzle of the Great Oxidation Event. Current Biology. 19:R574. PMID119640495

Waters, CN, Zalasiewicz, J, Summerhayes, C, Barnosky, AD, Poirier, C, Galuszka, A, Cearreta, A, Edgeworth, M, Ellis, EC, Ellis, M. 2016. The Anthropocene is functionally and stratigraphically distinct from the Holocene. Science. 351:aad2622.PMID26744408

Whitman WB, Coleman DC, and Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc Natl Acad Sci USA. 95(12):6578-6583. PMC33863

Module 02

Remapping the Body of the World

Evidence worksheet 04

Martinez et al 2007

Learning objectives

Discuss the relationship between microbial community structure and metabolic diversity.
Evaluate common methods for studying the diversity of microbial communities.
Recognize basic design elements in metagenomic workflows.

General questions

  • What were the main questions being asked?

    • What is the minimum number of genes needed to generate a fully functional proteorhodopsin (PR) system?
    • How did the PR system become so ubiquitous among diverse microbial taxa?
    • What is the function of each gene product in the photosystem biosynthetic pathway, and what is the specific functionality of the marine PR system?
  • What were the primary methodological approaches used?

    • Fosmid (large-insert DNA) libraries derived from marine picoplankton were screened in vivo for visibly detectable PR-expressing phenotypes. Since the PR system requires retinal, a molecule that E. coli cannot produce, screening was done on retinal-containing medium. The screen unexpectedly also recovered clones with a PR-expressing phenotype in the absence of retinal in the medium. These clones were then sequenced to verify whether retinal biosynthesis genes had been introduced and to identify those genes.
    • Transposon insertions were used to disrupt the putative retinal-biosynthesis operon in order to characterize the role of each gene in the biosynthesis pathway.
  • Summarize the main results or findings.
    • The clones exhibited highest identity to Alphaproteobacteria.
    • The clones contained a six-gene operon encoding putative enzymes for beta-carotene and retinal biosynthesis. Together with the PR gene, these six genes are enough to enable light-activated photophosphorylation in a heterologous host such as E. coli: the expressed photosystem has light-activated proton-translocating activity that is sufficient to drive photophosphorylation.
    • The retinal-biosynthesis operon is both necessary and sufficient for the complete synthesis and assembly of a fully functional PR photoprotein in E. coli. Also, PRs appear to function as light-activated ion pumps that contribute positively to cellular energy metabolism.
    • A single genetic event, such as horizontal gene transfer, can result in the acquisition of phototrophic capabilities by a chemoorganotrophic microorganism, which in part explains the widespread occurrence of PR systems among diverse microbial taxa. The PR system is also speculated to have spread through different microbial lineages because it is important in cellular bioenergetics, is simple and compact, and has a plasticity that enables it to persist in diverse phylogenetic groups.
  • Do new questions arise from the results?
    • The authors suggest that the link between PR and light-induced growth stimulation in Flavobacteria needs further support, as different studies have presented conflicting results.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?
    • The figures depicting the results of gene disruption in the retinal-biosynthesis pathway were easy to understand, as they first lay out the steps of the pathway and then present the outcome of each gene disruption. However, I felt that there was insufficient background information on PR systems and on why retinal was required. As a reader who does not know the role of the PR system in a cell, I found it difficult to grasp the logic behind the experimental methods employed. I also wished that the methods section explained why each specific technique was chosen for the experiment. Because of the lack of background information, it was difficult to establish the assumptions that underlie the chosen genetic analysis techniques. Nonetheless, I do think that the conclusions were justified based on the transposon-insertion gene-disruption method, as the phenotypic outcome can be visually observed.

Problem set 03

Alain et al 2009

Li et al 2014

Mitchell et al 2017

Papudeshi et al 2017

Parks et al 2017

Prakash et al 2012

Rappé et al 2003

Rastogi et al 2011

Solden et al 2016

Learning objectives:

Specific emphasis should be placed on the process used to find the answer. Be as comprehensive as possible e.g. provide URLs for web sources, literature citations, etc.
(Reminders for how to format links, etc in RMarkdown are in the RMarkdown Cheat Sheets)

Specific Questions:

  • How many prokaryotic divisions have been described and how many have no cultured representatives (microbial dark matter)?

    • As of 2016, 89 bacterial phyla and 20 archaeal phyla had been described in small subunit (16S) rRNA databases. However, there could be up to 1,500 bacterial phyla, as there are microbes that live in the “shadow biosphere”, a microbial biosphere whose members employ metabolic processes radically different from those of currently known life (Solden et al, 2016).
    • In 2003, 26 of the 52 major bacterial phyla identified from gene-sequence analysis had cultivated representatives (Rappé et al, 2003). By 2008, the number of identified bacterial phyla had increased to 100, but only 30 possessed a cultivated representative. With advancing understanding of microbial diversity and metabolic processes, as well as better culturing technologies, more of the previously “unculturable” prokaryotic divisions are likely to be cultured (Alain et al, 2009).
  • How many metagenome sequencing projects are currently available in the public domain and what types of environments are they sourced from?
    • As of 2017, there are nearly 70,000 bacterial and archaeal genomes in public repositories (Parks et al, 2017). Online software tools for metagenomic studies include MG-RAST, IMG/M, EBI Metagenomics and METAVIR. The EBI database alone contains over 1200 publicly available projects comprising ~75000 samples and ~100000 runs. The majority of these data are 16S rRNA gene amplicon datasets, followed by WGS metagenomic datasets, with a smaller number of metatranscriptomic studies and assemblies (Mitchell et al, 2017).
    • Sequencing projects can be sourced from almost any environment. For example, the EBI database contains biomes such as engineered, soil, freshwater, marine, plants, mammals and humans. Most sequencing data are from the gut, soil, sediments and aquatic environments. These environments are popular for sequencing because their inhabitants are hard to culture in a lab setting (Mitchell et al, 2017).
  • What types of on-line resources are available for warehousing and/or analyzing environmental sequence information (provide names, URLS and applications)?

  • What is the difference between phylogenetic and functional gene anchors and how can they be used in metagenome analysis?

| Phylogenetic | Functional |
|---|---|
| Vertical gene transfer | Horizontal gene transfer |
| Carry phylogenetic information allowing tree reconstruction | Identify specific biogeochemical functions associated with measurable effects |
| Taxonomic | Not as useful for phylogeny |
| Ideally single-copy | |

    • In metagenomics, phylogenetic gene anchors allow tree reconstruction based on lineage, as determined by vertical gene transfer. Functional gene anchors, meanwhile, permit identification of the functional composition and structure of the microbial community. For example, various types of functional gene arrays have been developed to analyze gene families involved in biogeochemical cycles (Li et al, 2014).
  • What is metagenomic sequence binning? What types of algorithmic approaches are used to produce sequence bins? What are some risks and opportunities associated with using sequence bins for metabolic reconstruction of uncultivated microorganisms?

    • Binning is the process in which an algorithm groups the sequences recovered from a community into clusters (bins) that it infers came from the same organism or group of closely related organisms, assigning them to taxonomic groups such as OTUs. The purpose is to reconstruct the genome of an organism, or group of organisms, from sequence fragments. A bin therefore holds all of the sequence variation from a population sharing the same genome within the community (Papudeshi et al, 2017).
    • There are two types of algorithms:
      • Based on sequence alignment to databases
      • Based on organism-specific characteristic: GC content, codon usage
    • Metrics of a good bin include
      • percentage completeness of the sequence
      • percentage of contamination
    • The risks can include
      • Incomplete coverage of the genome sequence
      • Contamination from sequences of a different phylogenetic origin
    • The main opportunity associated with using sequence bins for metabolic reconstruction is to reveal the metabolic potential, or pathogenicity, of uncultivated species that remain extremely difficult to study in the laboratory setting. In addition, samples are sequenced soon after being taken directly from the environment, which significantly reduces the opportunity for bias-inducing effects to afflict the sample (Papudeshi et al, 2017).
  • Is there an alternative to metagenomic shotgun sequencing that can be used to access the metabolic potential of uncultivated microorganisms? What are some risks and opportunities associated with this alternative?
    • Alternative methods include:
      • Functional screens:
        • allows the detection and identification of bacteria expressing key enzyme activities under specific conditions, as well as studying various microbial catabolic diversity (Rastogi et al, 2011).
        • The success of functional metagenomic screens lies in the availability and interpretation of environmental parameter data. Knowledge and intuition about sampling environments are important in selecting appropriate screening targets and substrates (Rastogi et al, 2011).
        • Biochemical screens can also introduce biases that favour the dominant communities under the specific conditions used (Rastogi et al, 2011).
      • Fluorescence in situ hybridization (FISH) probe:
        • enables in situ phylogenetic identification and enumeration of individual microbial cells by whole cell hybridization with oligonucleotide probes (Rastogi et al, 2011).
        • FISH can be combined with flow cytometry for a high-resolution automated analysis of mixed microbial populations (Rastogi et al, 2011).
        • Low signal intensity, background fluorescence, and target inaccessibility are commonly encountered problems in FISH analysis (Rastogi et al, 2011).
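The composition-based binning mentioned in the binning answer above (grouping sequences by an organism-specific characteristic such as GC content) can be sketched in base R. This is only a toy illustration under stated assumptions: the four contigs are made-up sequences, the number of bins (k = 2) is chosen by hand, and GC content alone is used, whereas real binning tools combine several signals.

```r
# Hypothetical contigs (made-up sequences, not real data)
contigs <- c(
  c1 = "ATATTTAAGCATTA",
  c2 = "GCGCCGGGCATGCC",
  c3 = "TTAATACGATATAA",
  c4 = "CCGGGCGCTAGCGG"
)

# Fraction of G and C bases in one sequence
gc_content <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  mean(bases %in% c("G", "C"))
}

# GC content per contig (names are carried over from `contigs`)
gc <- sapply(contigs, gc_content)

# Cluster contigs on GC content and cut the tree into 2 bins
# (k = 2 is an assumption made for this toy example)
bins <- cutree(hclust(dist(gc)), k = 2)

print(round(gc, 3))
print(bins)
```

Contigs with similar GC content end up in the same bin; a contig from a different organism with a similar GC content would be placed there too, which is one way the contamination risk listed above arises.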

Problem set 03 references

  1. Solden, L, Lloyd, K, Wrighton, K. 2016. The bright side of microbial dark matter: lessons learned from the uncultivated majority. Curr. Opin. Microbiol. 31:217-226. PMID 27196505

  2. Rappé, MS, Giovannoni, SJ. 2003. The uncultured microbial majority. Annual Reviews in Microbiology. 57:369-394. PMID 14527284

  3. Alain, K, Querellou, J. 2009. Cultivating the uncultured: limits, advances and future challenges. Extremophiles. 13:583-594. PMID 19548063

  4. Parks, DH, Rinke, C, Chuvochina, M, Chaumeil, P, Woodcroft, BJ, Evans, PN, Hugenholtz, P, Tyson, GW. 2017. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nature Microbiology. 2:1533. PMID 28894102

  5. Mitchell, AL, Scheremetjew, M, Denise, H, Potter, S, Tarkowska, A, Qureshi, M, Salazar, GA, Pesseat, S, Boland, MA, Hunter, FMI. 2017. EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies. Nucleic Acids Res. 46:D735. PMID 29069476

  6. Prakash, T, Taylor, TD. 2012. Functional assignment of metagenomic data: challenges and applications. Briefings in Bioinformatics. 13:711-727. PMC 3504928

  7. Li, Y, He, J, He, Z, Zhou, Y, Yuan, M, Xu, X, Sun, F, Liu, C, Li, J, Xie, W. 2014. Phylogenetic and functional gene structure shifts of the oral microbiomes in periodontitis patients. The ISME Journal. 8:1879. PMC 4139721

  8. Papudeshi, B, Haggerty, JM, Doane, M, Morris, MM, Walsh, K, Beattie, DT, Pande, D, Zaeri, P, Silva, GG, Thompson, F. 2017. Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes. BMC Genomics. 18:915. PMC 5706307

  9. Rastogi, G, Sani, RK. 2011. Molecular techniques to assess microbial community structure, function, and dynamics in the environment. In Anonymous Microbes and microbial technology. Springer. PMID 10782339

Module 02 references

Alain, K, Querellou, J. 2009. Cultivating the uncultured: limits, advances and future challenges. Extremophiles. 13:583-594.PMID 19548063

Li, Y, He, J, He, Z, Zhou, Y, Yuan, M, Xu, X, Sun, F, Liu, C, Li, J, Xie, W. 2014. Phylogenetic and functional gene structure shifts of the oral microbiomes in periodontitis patients. The ISME Journal. 8:1879. PMC 4139721

Madsen, EL. 2005. Identifying microorganisms responsible for ecologically significant biogeochemical processes. Nature Reviews Microbiology. 3:439. PMID15864265

Martinez A, Bradley AS, Waldbauer JR, Summons RE, DeLong EF. 2007. Proteorhodopsin photosystem gene expression enables photophosphorylation in a heterologous host. Proc Natl Acad Sci U.S.A. 104(13):5590-5595. PMC1838496

Mitchell, AL, Scheremetjew, M, Denise, H, Potter, S, Tarkowska, A, Qureshi, M, Salazar, GA, Pesseat, S, Boland, MA, Hunter, FMI. 2017. EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies. Nucleic Acids Res. 46:D735. PMID 29069476

Parks, DH, Rinke, C, Chuvochina, M, Chaumeil, P, Woodcroft, BJ, Evans, PN, Hugenholtz, P, Tyson, GW. 2017. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nature Microbiology. 2:1533.PMID28894102

Papudeshi, B, Haggerty, JM, Doane, M, Morris, MM, Walsh, K, Beattie, DT, Pande, D, Zaeri, P, Silva, GG, Thompson, F. 2017. Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes. BMC Genomics. 18:915. PMC 5706307

Prakash, T, Taylor, TD. 2012. Functional assignment of metagenomic data: challenges and applications. Briefings in Bioinformatics. 13:711-727. PMC 3504928

Rappé, MS, Giovannoni, SJ. 2003. The uncultured microbial majority. Annual Reviews in Microbiology. 57:369-394. PMID 14527284

Rastogi, G, Sani, RK. 2011. Molecular techniques to assess microbial community structure, function, and dynamics in the environment. In Anonymous Microbes and microbial technology. Springer.PMID 10782339

Taupp, M, Mewis, K, Hallam, SJ. 2011. The art and design of functional metagenomic screens. Curr. Opin. Biotechnol. 22:465-472.PMID21440432

Solden, L, Lloyd, K, Wrighton, K. 2016. The bright side of microbial dark matter: lessons learned from the uncultivated majority. Curr. Opin. Microbiol. 31:217-226. PMID 27196505

Wooley, JC, Godzik, A, Friedberg, I. 2010. A primer on metagenomics. PLoS Computational Biology. 6:e1000667.PMID20195499

Module 03

Microbial Species Concepts

Problem set 04

Learning objectives:

  • Gain experience estimating diversity within a hypothetical microbial community

Part 1: Description and enumeration

Obtain a collection of “microbial” cells from “seawater”. The cells were concentrated from different depth intervals by a marine microbiologist travelling along the Line-P transect in the northeast subarctic Pacific Ocean off the coast of Vancouver Island British Columbia.

Sort out and identify different microbial “species” based on shared properties or traits. Record your data in this Rmarkdown using the example data as a guide.

Once you have defined your binning criteria, separate the cells using the sampling bags provided. These operational taxonomic units (OTUs) will be considered separate “species”. This problem set is based on content available at What is Biodiversity.

For example, load in the packages you will use.

#To make tables
library(kableExtra)
library(knitr)
#To manipulate and plot data
library(tidyverse)

Then load in the data. You should use a similar format to record your community data. Finally, use these data to create a table.

For your sample:

  • Construct a table listing each species, its distinguishing characteristics, the name you have given it, and the number of occurrences of the species in the collection.

*Data were collected in Excel and exported as a TSV file.

# Read the tab-separated tables exported from Excel; the first column is used as row names
CandyCommunity = read.table(file="Candy Seawater.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
CandySample_V2 = read.table(file="Candy Sample.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
  • Ask yourself if your collection of microbial cells from seawater represents the actual diversity of microorganisms inhabiting waters along the Line-P transect. Were the majority of different species sampled or were many missed?
  • It is difficult to conclude whether my collection represents the actual diversity without further analysis. As the collection methods are unknown, it is uncertain whether there was any sampling bias, or whether a sufficient number of samples was taken from different areas of the seawater.

Part 2: Collector’s curve

To help answer the questions raised in Part 1, you will conduct a simple but informative analysis that is a standard practice in biodiversity surveys. This analysis involves constructing a collector’s curve that plots the cumulative number of species observed along the y-axis and the cumulative number of individuals classified along the x-axis. This curve is an increasing function with a slope that will decrease as more individuals are classified and as fewer species remain to be identified. If sampling stops while the curve is still rapidly increasing then this indicates that sampling is incomplete and many species remain undetected. Alternatively, if the slope of the curve reaches zero (flattens out), sampling is likely more than adequate.

To construct the curve for your samples, choose a cell within the collection at random. This will be your first data point, such that X = 1 and Y = 1. Next, move consistently in any direction to a new cell and record whether it is different from the first. In this step X = 2, but Y may remain 1 or change to 2 if the individual represents a new species. Repeat this process until you have proceeded through all cells in your collection.
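The counting procedure described above can be sketched in a few lines of R. This is a minimal illustration, assuming the cells have already been recorded as a vector of species labels in the order they were picked; the labels below are made up and are not the candy data.

```r
# Hypothetical species labels, in the order the cells were picked
picked <- c("A", "B", "A", "C", "B", "A", "D", "C", "A", "B")

# X: cumulative number of individuals classified
x <- seq_along(picked)

# Y: cumulative number of species observed.
# !duplicated() is TRUE only the first time a label appears,
# so its running sum counts the distinct species seen so far.
y <- cumsum(!duplicated(picked))

collectors_curve <- data.frame(x = x, y = y)
print(collectors_curve)
```

The resulting x and y columns have the same layout as the table plotted in the collector's curve for this problem set.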

Create a plot. We will use a scatterplot (geom_point) to plot the raw data and then add a smoother to see the overall trend of the data.

For your sample:

  • Create a collector’s curve.

*Data were collected in Excel and exported as a TSV file.

Collectors_Curve_V2 = read.table (file="Collectors Curve.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN","NA","."))
ggplot(Collectors_Curve_V2, aes(x=x, y=y)) +
  geom_point() +
  geom_smooth() +
  labs(x="Cumulative number of individuals classified", y="Cumulative number of species observed")
## `geom_smooth()` using method = 'loess'

  • Does the curve flatten out? If so, after how many individual cells have been collected?
  • Yes, the curve flattens out after 38 individual cells have been collected.

  • What can you conclude from the shape of your collector’s curve as to your depth of sampling?
  • The collector’s curve is an increasing function with a slope that decreases as more individuals are classified and fewer species remain to be identified. The curve approaches zero slope at the top, meaning that few to no new species remain to be identified. Sampling is therefore likely to be adequate.

Part 3: Diversity estimates (alpha diversity)

Using the table from Part 1, calculate species diversity using the following indices or metrics.

Diversity: Simpson Reciprocal Index

\(\frac{1}{D}\) where \(D = \sum p_i^2\)

\(p_i\) = the fractional abundance of the \(i^{th}\) species

The higher the value is, the greater the diversity. The maximum value is the number of species in the sample, which occurs when all species contain an equal number of individuals. Because the index reflects the number of species present (richness) and the relative proportions of each species within a community (evenness), this metric is a diversity metric. Consider that a community can have the same number of species (equal richness) but manifest a skewed distribution in the proportion of each species (unequal evenness), which would result in different diversity values.
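As a worked illustration of the formula above, here is a minimal base R sketch for a small example community of 3 species with 2, 4 and 1 individuals. The species names are placeholders chosen for this example.

```r
# Example community: 3 species with 2, 4 and 1 individuals
counts <- c(sp1 = 2, sp2 = 4, sp3 = 1)

# p_i: fractional abundance of each species
p <- counts / sum(counts)

# D = sum of p_i^2; the Simpson Reciprocal Index is 1/D
D <- sum(p^2)
simpson_reciprocal <- 1 / D
print(simpson_reciprocal)

# A perfectly even community of 3 species reaches the maximum, 3
even_counts <- c(3, 3, 3)
even_reciprocal <- 1 / sum((even_counts / sum(even_counts))^2)
print(even_reciprocal)
```

Here D = (4 + 16 + 1)/49, so the index is 49/21, or about 2.33, which is below the maximum of 3 because the community is uneven.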

  • What is the Simpson Reciprocal Index for your sample?

  • The Simpson Reciprocal Index was calculated in Excel:
  • Community sample: 23.17449389
  • Individual Sample: 24.96018735

Richness: Chao1 richness estimator

Another way to calculate diversity is to estimate the number of species that are present in a sample based on the empirical data to give an upper boundary of the richness of a sample. Here, we use the Chao1 richness estimator.

\(S_{chao1} = S_{obs} + \frac{a^2}{2b}\)

\(S_{obs}\) = total number of species observed; a = number of species observed once; b = number of species observed twice or more

So for our previous example community of 3 species with 2, 4, and 1 individuals each, \(S_{chao1}\) =

3 + 1^2/(2*2)
## [1] 3.25
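The hand calculation above can be wrapped in a small function. This is a sketch that follows the definitions given in this problem set (a = species observed once, b = species observed twice or more); the function name and the doubletons_only argument are made up for illustration. Note that the commonly published form of Chao1 counts only species observed exactly twice for b, which the optional argument reproduces.

```r
# Chao1 as defined in this problem set.
# counts: vector with the number of individuals per species
# doubletons_only: if TRUE, b counts species seen exactly twice
#   (the commonly published definition) instead of twice or more
chao1_sketch <- function(counts, doubletons_only = FALSE) {
  counts <- counts[counts > 0]          # ignore species with no individuals
  s_obs <- length(counts)               # total number of species observed
  a <- sum(counts == 1)                 # species observed once
  b <- if (doubletons_only) sum(counts == 2) else sum(counts >= 2)
  s_obs + a^2 / (2 * b)                 # note: not finite when b is 0
}

example_counts <- c(2, 4, 1)
print(chao1_sketch(example_counts))                          # definition used here
print(chao1_sketch(example_counts, doubletons_only = TRUE))  # exactly-twice variant
```

For the example community the two definitions give 3.25 and 3.5, so it matters which one a given tool implements when comparing results.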
  • What is the chao1 estimate for your sample?

Community sample:

51 + (13^2/(2*38))
## [1] 53.22368

Individual sample:

39 + (13^2/(2*26))
## [1] 42.25

Part 4: Alpha-diversity functions in R

We’ve been doing the above calculations by hand, which is a very good exercise to aid in understanding the math behind these estimates. Not surprisingly, these same calculations can be done with R functions. Since we just have a species table, we will use the vegan package. You will need to install this package if you have not done so previously.

library(vegan)

First, we must remove the unnecessary data columns and transpose the data so that vegan reads it as a species table with species as columns and samples as rows (of which you only have 1).

Then we can calculate the Simpson Reciprocal Index using the diversity function.

And we can calculate the Chao1 richness estimator (and others by default) with the specpool function for extrapolated species richness. This function rounds to the nearest whole number so the value will be slightly different from what you’ve calculated above.
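The reshaping step described above can be sketched in base R. This is a minimal illustration, assuming a long table with name and occurences columns like the ones used in this problem set; the three species and their counts are made up, and the vegan call only runs if the package is installed.

```r
# Hypothetical long-format table: one row per species (made-up counts)
example_long <- data.frame(
  name = c("sp1", "sp2", "sp3"),
  occurences = c(2, 4, 1)
)

# vegan expects samples as rows and species as columns,
# so build a one-row matrix with species names as column names
species_table <- matrix(
  example_long$occurences,
  nrow = 1,
  dimnames = list("sample1", example_long$name)
)
print(species_table)

# Simpson Reciprocal Index with vegan, if it is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  print(vegan::diversity(species_table, index = "invsimpson"))
}
```

The tidyverse select() and spread() calls used for the candy data later in this part produce the same one-row layout.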

In Project 1, you will also see functions for calculating alpha-diversity in the phyloseq package since we will be working with data in that form.

For your sample:

  • What is the Simpson Reciprocal Index using the R function?

For community sample:

CandyCommunity_diversity = 
     CandyCommunity %>% 
     select(name, occurences) %>% 
     spread(name, occurences)
diversity(CandyCommunity_diversity, index="invsimpson")
## [1] 23.17449

For individual sample:

CandySample_diversity_V2 = 
     CandySample_V2 %>% 
     select(name, occurences) %>% 
     spread(name, occurences)
diversity(CandySample_diversity_V2, index="invsimpson")
## [1] 24.96019
  • What is the chao1 estimate using the R function?

For Community Sample:

specpool(CandyCommunity_diversity)
##     Species chao chao.se jack1 jack1.se jack2 boot boot.se n
## All      51   51       0    51        0    51   51       0 1

For Individual Sample:

specpool(CandySample_diversity_V2)
##     Species chao chao.se jack1 jack1.se jack2 boot boot.se n
## All      39   39       0    39        0    39   39       0 1

*Verify that these values match your previous calculations.
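The Simpson Reciprocal Index values from the manual calculation and the R function match for both the individual and community samples. However, the Chao1 values do not match: specpool returned the observed number of species (51 and 39) as its chao estimate, whereas the manual calculation gave 53.22 and 42.25.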

Part 5: Concluding activity

If you are stuck on some of these final questions, reading the Kunin et al. 2010 and Lundin et al. 2012 papers may provide helpful insights.

  • How does the measure of diversity depend on the definition of species in your samples?
    • The measure of diversity changes with how I group the different organisms (pieces of candy) into species according to their characteristics. If more organisms are grouped into the same species, there are fewer types of organisms in the community sample and the measure of diversity decreases. As a result, the alpha-diversity and Chao1 values will both decrease.
  • Can you think of alternative ways to cluster or bin your data that might change the observed number of species?
    • The observed number of species will also change based on what I define as “a single species”. For example, the Twizzlers are long and filamentous. We defined a clump of them as one species, while others may define each string as a species.
  • How might different sequencing technologies influence observed diversity in a sample?

    • Observed diversity depends on the parameters of the sequencing pipeline: for example, the sequencing platform, the sample preparation (whether samples are processed consistently to extract their DNA), and whether the pipelines look at the same gene region (e.g. within the 16S rRNA gene) to identify different individuals.
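The effect of the species definition on measured diversity, discussed in the answers above, can be shown with a small base R sketch. The candy categories and counts below are made up for illustration: the same 20 pieces are binned first by colour and shape, then by colour only.

```r
# Simpson Reciprocal Index from a vector of counts
inv_simpson <- function(counts) {
  p <- counts / sum(counts)
  1 / sum(p^2)
}

# Fine binning: colour and shape together define a "species" (made-up counts)
fine <- c(red_round = 5, red_square = 5, green_round = 5, green_square = 5)

# Coarse binning: lump bins that share a colour by summing their counts
colour <- sub("_.*", "", names(fine))
coarse <- tapply(fine, colour, sum)

print(length(fine));   print(inv_simpson(fine))    # 4 species
print(length(coarse)); print(inv_simpson(coarse))  # 2 species
```

Lumping halves both the observed richness (4 to 2) and the Simpson Reciprocal Index (4 to 2), even though the underlying collection has not changed.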

Evidence Worksheet 05

Welch et al 2002

Part 1: Learning objectives

Evaluate the concept of microbial species based on environmental surveys and cultivation studies.
Explain the relationship between microdiversity, genomic diversity and metabolic potential.
Comment on the forces mediating divergence and cohesion in natural microbial communities.

General questions

  • What were the main questions being asked?
    • What are the genetic bases for pathogenicity and the evolutionary diversity of E. coli, as revealed by comparing the genomes of the uropathogenic strain CFT073, the enterohemorrhagic strain EDL933 and the non-pathogenic lab strain MG1655?
    • How do the various strains of the same species differ or relate genetically and phenotypically?
    • Where did the pathogenicity islands that gave rise to various ecotypes come from?
    • How does the process of gene transfer change the ecotype through the acquisition of gene islands?
    • What is the relationship between microdiversity, genomic diversity and metabolic potential?
  • What were the primary methodological approaches used?
    • Automated Sanger sequencing (dye-terminator chemistry) of CFT073 on ABI 3700 automated sequencers. The sequence was then annotated through the MAGPIE pipeline, and the predicted proteins were searched against sequence databases with BLAST.
  • Summarize the main results or findings.

    • Although they belong to the same species, strains MG1655, EDL933, and CFT073 have distinct ecotypes.
    • More than 70% of the ORFs unique to MG1655 or EDL933 have been replaced with new genes specific to CFT073. Only 39.2% of the combined, non-redundant set of proteins is common to all three strains.
    • Genes in the backbone were acquired through vertical transmission and show relatively little change between CFT073 and EDL933, but island genes in CFT073 have codon usage distinct from that of the backbone, indicating that the island genes originated from lateral gene transfer.
    • The island-specific genes also differ greatly from those of the other investigated strains: only 204 of 2004 island-specific genes are shared with EDL933.
    • The type III secretion system that confers disease potential in EDL933, common to E. coli O157:H7, is absent from CFT073. Instead, the CFT073 islands carry specific fimbrial adhesins, secreted autotransporters and phase-switch recombinases. These genes encode adaptive traits that increase the strain's fitness for colonizing the urinary tract, shaping its ecotype and pathogenesis.
    • In summary, island acquisition through HGT gave the uropathogenic strain of E. coli the capability to infect the urinary tract without compromising its ability to harmlessly colonize the intestine.
  • Do new questions arise from the results?
    • The results of the paper suggest further investigation into other species, to see whether similarly large genotypic and phenotypic differences exist between pathogenic and non-pathogenic strains, and whether the distinct phenotypes were conferred by islands obtained through horizontal gene transfer.
    • In addition, the authors concluded that the genes preserved in the backbone allow CFT073 to survive in the intestine, even though it is a uropathogenic strain. They raised the question of whether there are “black holes”, i.e. deletions that remove backbone genes detrimental to the uropathogenic lifestyle. At this time this is difficult to assess because of the large number of genetic differences already observed, so additional E. coli genome sequences are needed for comparison.
    • This paper revealed the existence of large differences in ecotype among organisms classified under the same species. It raises the question of whether the current species definition should be revised, and what a reasonable way to classify species would be.
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?

    • The paper had a short methods section that did not explain in enough detail how the sequencing techniques and technologies were used to reach the conclusions about the different pathogenicity islands and genome differences. The assumptions involved in the analysis were also not obvious to the reader. In addition, I had difficulty following the sections detailing the gene clusters in the pathogenicity islands. There was not enough background information to understand, for example, how specific operons play a role in the fimbriae.

Part 2: Learning objectives

Comment on the creative tension between gene loss, duplication and acquisition as it relates to microbial genome evolution.
Identify common molecular signatures used to infer genomic identity and cohesion.
Differentiate between mobile elements and different modes of gene transfer.

Specific question

  • Based on your reading and discussion notes, explain the meaning and content of the following figure derived from the comparative genomic analysis of three E. coli genomes by Welch et al. Remember that CFT073 is a uropathogenic strain and that EDL933 is an enterohemorrhagic strain. Explain how this study relates to your understanding of ecotype diversity. Provide a definition of ecotype in the context of the human body. Explain why certain subsets of genes in CFT073 provide adaptive traits under your ecological model and speculate on their mode of vertical descent or gene transfer.
    • Ecotypes can be defined as genetically similar populations that occupy distinct ecological niches. An ecotype possesses all the properties of an organism, such as biochemical properties, functional phenotypes and virulence factors. In the human context, different strains of E. coli can occupy different habitats in the human body because they have different niches and different pathogenicities. For example, the uropathogenic strain CFT073 can colonize the urinary tract, while the enterohemorrhagic strain EDL933 colonizes the intestine.
    • The diagram shows the locations and sizes of the CFT073 and EDL933 islands in the backbone. Most of the backbone carries identical genes in the two strains, except for the gene clusters in the strain-specific islands. These islands often contain pathogenicity genes that confer a unique ecotype on the organism. For example, the pap operons in the pheV- and pheU-associated islands, which partly mediate specific attachment to the urinary tract, are found in CFT073 but not in EDL933, providing evidence for the variation in ecotypes.
    • Codon usage analysis revealed highly similar codon usage across the E. coli backbone, and this shared codon bias is not seen in the genes unique to each of the two strains. The common genes in the backbone of the E. coli genome have been preserved throughout its vertical evolution. Meanwhile, the island-specific genes of the two strains have distinct codon usage, indicating that they originated from horizontal gene transfer of mobile elements. These HGT gene clusters are involved in novel metabolic functions and provide a fitness benefit, which enables them to be readily transferred.
    • In summary, the island-specific genes obtained through HGT have given the uropathogenic strain a distinct functional phenotype. Through environmental selection, these genes eventually allowed it to adapt to the urinary tract. However, the backbone genes inherited through vertical descent still permit the uropathogenic strain to survive in intestinal environments.
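
The codon usage comparison described above can be illustrated with a short Python sketch. It is not the analysis Welch et al. performed: the three sequences are hypothetical toy examples that encode the same peptide with different synonymous codons, and the distance used here (half the sum of absolute differences in codon frequency) is an assumption chosen for simplicity.

```python
from collections import Counter

def codon_frequencies(seq):
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    total = len(codons)
    return {codon: n / total for codon, n in Counter(codons).items()}

def codon_usage_distance(seq_a, seq_b):
    """0.0 means identical codon usage, 1.0 means no codons in common."""
    fa, fb = codon_frequencies(seq_a), codon_frequencies(seq_b)
    return 0.5 * sum(abs(fa.get(c, 0.0) - fb.get(c, 0.0)) for c in set(fa) | set(fb))

# Hypothetical sequences, all encoding Leu-Leu-Lys-Ala-Leu-Glu in some order.
backbone_a = "CTGCTGAAAGCGCTGGAA"  # toy "backbone" gene from strain A
backbone_b = "CTGAAACTGGCGGAACTG"  # toy "backbone" gene from strain B, same codon bias
island = "TTATTAAAGGCATTAGAG"      # toy "island" gene using different synonymous codons

print(codon_usage_distance(backbone_a, backbone_b))  # small: shared codon bias
print(codon_usage_distance(backbone_a, island))      # large: atypical codon usage
```

A gene whose codon usage sits far from the backbone's, as the toy island gene does here, is the kind of signal that points to acquisition by horizontal transfer rather than vertical descent.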

Writing assessment 03

Discuss the challenges involved in defining a microbial species and how HGT complicates matters, especially in the context of the evolution and phylogenetic distribution of microbial metabolic pathways. Can you comment on how HGT influences the maintenance of global biogeochemical cycles through time? Finally, do you think it is necessary to have a clear definition of a microbial species? Why or why not?

In the prokaryotic world, horizontal gene transfer (HGT) is a driving force of microbial evolution under selective pressures. Currently, evolutionary phylogenetic relationships are established through metagenomic analysis using DNA hybridization or gene sequencing. However, the contribution of HGT to evolution complicates the reconstruction of phylogenies. The traditional definition of species may no longer be suitable for classification and must be refined with consideration of HGT events. This essay will explore the current understanding of HGT and species definition from three perspectives: how the distribution of core metabolic genes via HGT impacts the global biogeochemical cycles, the current challenges in species classification caused by HGT, and the possibility of a redefined classification system that incorporates ecological roles in addition to defining microbial species by genetic information.

The planetary genes involved in metabolic activities that shaped the evolution and maintenance of global biogeochemical pathways were in part distributed by HGT across the three domains of life. HGT gives organisms the flexibility to transfer genes to closely related species and, at times, even to distantly related groups (1). Take photosynthetic metabolism in the oxygen cycle as an example: evidence suggests that genes that were part of the primitive photosystems in the anoxygenic ancestor were disseminated by HGT between different groups of bacteria (2). Specifically, the observation of highly similar photosynthesis genes in two phylogenetically distinct groups of purple bacteria suggests that the puf superoperon encoding the entire anoxygenic photosynthetic apparatus was horizontally transferred between the α and β subclasses (3,4). In addition to the acquisition of metabolic functions through HGT, various environmental selective pressures serve as major drivers for the retention or evolution of these genes. For instance, anoxygenic phototrophs subjected to an anaerobic environment preserved the ancestral features of the photosynthetic apparatus to maximize energy output (2). Meanwhile, the selective pressures of UV light and depletion of electron donors pushed the evolution of anoxygenic type I and II reaction centers into oxygenic photosystems I and II in cyanobacteria (2). In other words, gene clusters involved in a single metabolic function provide a fitness benefit that enables them to be readily transferred. The widespread distribution of certain metabolic genes through HGT protects the metabolic pathway from extinction under environmental perturbations (1). Although HGT has in part shaped and maintained the biogeochemical cycles, it complicates the Darwinian model of microbial classification and species definition, because organisms can acquire novel genes in a single generation that vastly diversify both their genotype and phenotype.

The prokaryotic species definition remains a challenging issue to advance in the face of HGT. Currently, the two most common genetic approaches integrated into species definition are DNA-DNA hybridization and the rRNA gene-sequence approach (5). As pioneered by Woese, SSU or 16S ribosomal RNA (rRNA) has been used to identify phylogenetic relationships between organisms with a universally distributed gene (5). In metagenomics, classification of organisms relies on comparing rRNA gene sequences in mixed reads from different species in a community (7). These reads are binned into operational taxonomic units (OTUs) based on genomic similarities and functional characteristics, which are obtained from sequence databases (7). The challenge with this process lies in discerning which species a binned gene came from (7). As no gene appears to be immune to HGT, including informational genes such as rRNA (1), the intragenomic heterogeneity of rRNA genes resulting from HGT could lead to overestimation of community diversity (8). The current gold standard for delineating a species is 70% DNA-DNA binding, representing relatedness in gene content and nucleotide similarity (5). While this species definition is appropriate in eukaryotes, it has serious limitations in prokaryotes. The fact that the 30% variation within a single bacterial species can be as diverse as an entire vertebrate order (9) suggests that a wide range of phenotypic differences exists within each bacterial species. For example, Escherichia coli (E. coli) strains CFT073, EDL933 and MG1655 are classified within the same species, but their pathogenicity and ecological niches differ widely. CFT073 is uropathogenic, while EDL933 is enterohemorrhagic and MG1655 is a non-pathogenic laboratory strain (10). The variation in ecological niche is in part due to the establishment of CFT073-unique island genes through HGT events. These genes encode tissue-specific fimbriae and pili that help the bacteria colonize new niches and establish pathogenic mechanisms in the urinary tract (10). In contrast, if we applied the same 70% genetic similarity criterion to eukaryotes, humans would be considered the same species as rodents (11). In summary, HGT provides a paradigm for generating rich ecotypes that the traditional species definition fails to capture. There is a need for a redefined means of classifying prokaryotic species.
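
How a similarity cut-off shapes the number of OTUs can be seen in a toy Python sketch. It is only an illustration under stated assumptions: the reads are hypothetical, pre-aligned and of equal length, identity is the fraction of matching positions, and the greedy seed-based clustering is a simplification rather than the procedure of any particular published pipeline.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(reads, threshold):
    """Assign each read to the first OTU whose seed it matches at >= threshold."""
    otus = []  # each OTU is a list of reads; its first member is the seed
    for read in reads:
        for otu in otus:
            if identity(read, otu[0]) >= threshold:
                otu.append(read)
                break
        else:
            otus.append([read])  # no seed was similar enough: start a new OTU
    return otus

# Hypothetical 10-base "marker gene" reads.
reads = [
    "ACGTACGTAC",
    "ACGTACGTAT",  # 1 mismatch to the first read
    "ACGTACGAAT",  # 2 mismatches to the first read
    "TTGCAAGGCC",  # unrelated
]

for threshold in (1.0, 0.9, 0.8):
    print(threshold, len(greedy_otus(reads, threshold)))  # 4, 3, then 2 OTUs
```

The same four reads yield four, three or two "species" depending only on the chosen threshold, which is why a fixed cut-off is an operational convenience rather than a biological boundary.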

It is important to have a clear definition of species, but there is a need to reconsider classification based solely on genetic similarity. Species definitions play a vital role in clinical settings and public health. The significance of labelling is that it allows scientists and clinicians to integrate findings on bacterial diseases and make consistent clinical diagnoses (12). However, the current species definition confers little evolutionary and ecological meaning in prokaryotes, partly due to HGT as mentioned previously. In addition, classifying new organisms based solely on genetic analysis is essentially meaningless, since clinical laboratories continue to rely on biochemical properties rather than gene sequencing to identify bacterial strains (12). Thus, the species definition may be refined through a concept that combines a theoretical definition (DNA analysis) with an operational definition (functional phenotypes). While frequent gene transfer events between closely related organisms make lineages “fuzzy” at the species level, sequence similarity approaches can still reveal larger evolutionary patterns for classification (1). A more meaningful definition of species can incorporate the concept of ecotypes. Bacterial ecotypes have been defined as genetically similar populations that occupy distinct ecological niches (5). Below the E. coli species level, all strains exhibiting the same ecotype can be classified into a single group. For example, all uropathogenic E. coli would be grouped together as one “strain” regardless of genetic variation. This refined species definition will continue to hold medical significance, as ecotypes possess all the properties of an organism, such as biochemical properties, functional phenotypes, and virulence factors.

Horizontal gene transfer is an important force modulating microbial evolution, but it complicates the current system of species definition. During microbial evolution, HGT events have shaped the phylogenetic distribution of microbial metabolic pathways by propagating genes that confer a higher level of fitness under environmental selective pressures. Metabolic pathways such as photosynthesis that were acquired through HGT influenced the early biogeochemical cycles and continue to maintain them today. However, HGT complicates the process of defining microbial species at the genetic level because of its influence on rRNA and its role in intra-species diversification. Although HGT renders defining species challenging, it is still essential to have a clear definition of species for clinical diagnosis and treatment. A marriage between genetic similarity and operational classification has been proposed to refine the current species definition by grouping together the various strains that share the same ecotype within a species. With rapid advances in molecular and genome biology, a sequence-based framework with sufficient flexibility to accommodate the vast differences in biology may revolutionize taxonomy.

Writing assessment 03 references

  1. Gogarten, JP, Doolittle, WF, Lawrence, JG. 2002. Prokaryotic evolution in light of gene transfer. Mol. Biol. Evol. 19:2226-2238. PMID: 12446813

  2. Mulkidjanian, AY, Koonin, EV, Makarova, KS, Mekhedov, SL, Sorokin, A, Wolf, YI, Dufresne, A, Partensky, F, Burd, H, Kaznadzey, D. 2006. The cyanobacterial genome core and the origin of photosynthesis. Proceedings of the National Academy of Sciences. 103:13126-13131. PMID: 16924101

  3. Igarashi, N, Harada, J, Nagashima, S, Matsuura, K, Shimada, K, Nagashima, KV. 2001. Horizontal transfer of the photosynthesis gene cluster and operon rearrangement in purple bacteria. J. Mol. Evol. 52:333-341. PMID: 11343129

  4. Nagashima, KV, Hiraishi, A, Shimada, K, Matsuura, K. 1997. Horizontal transfer of genes coding for the photosynthetic reaction centers of purple bacteria. J. Mol. Evol. 45:131-136. PMID: 9236272

  5. Gevers, D, Cohan, FM, Lawrence, JG, Spratt, BG, Coenye, T, Feil, EJ, Stackebrandt, E, Van de Peer, Y, Vandamme, P, Thompson, FL. 2005. Re-evaluating prokaryotic species. Nature Reviews Microbiology. 3:733. PMID: 16138101

  6. Lawrence, JG. 2002. Gene transfer in bacteria: speciation without species? Theor. Popul. Biol. 61:449-460. PMID: 12167364

  7. Tamames, J, Moya, A. 2008. Estimating the extent of horizontal gene transfer in metagenomic sequences. BMC Genomics. 9:136. PMID: 18366724

  8. Tian, R, Cai, L, Zhang, W, Cao, H, Qian, P. 2015. Rare events of intragenus and intraspecies horizontal transfer of the 16S rRNA gene. Genome Biology and Evolution. 7:2310-2320. PMID: 26220935

  9. Doolittle, WF, Zhaxybayeva, O. 2009. On the origin of prokaryotic species. Genome Res. 19:744-756.

  10. Welch, RA, Burland, V, Plunkett, G, Redford, P, Roesch, P, Rasko, D, Buckles, EL, Liou, S, Boutin, A, Hackett, J. 2002. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proceedings of the National Academy of Sciences. 99:17020-17024. PMID: 12471157

  11. Church, DM, Goodstadt, L, Hillier, LW, Zody, MC, Goldstein, S, She, X, Bult, CJ, Agarwala, R, Cherry, JL, DiCuccio, M. 2009. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biology. 7:e1000112. PMID: 19468303

  12. Janda, JM, Abbott, SL. 2002. Bacterial identification for publication: when is enough enough? J. Clin. Microbiol. 40:1887-1891. PMID: 12037039

Module 03 references

Allali, I, Arnold, JW, Roach, J, Cadenas, MB, Butz, N, Hassan, HM, Koci, M, Ballou, A, Mendoza, M, Ali, R. 2017. A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome. BMC Microbiology. 17:194. PMID: 28903732

Breitburg, D, Levin, LA, Oschlies, A, Gregoire, M, Chavez, FP, Conley, DJ, Gareon, V, Gilbert, D, Gutierrez, D, Isensee, K. 2018. Declining oxygen in the global ocean and coastal waters. Science. 359:eaam7240. PMID: 29301986

Callahan, BJ, McMurdie, PJ, Holmes, SP. 2017. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. The ISME Journal. 11:2639. PMID: 28731476

Chen, W, Zhang, CK, Cheng, Y, Zhang, S, Zhao, H. 2013. A comparison of methods for clustering 16S rRNA sequences into OTUs. PloS One. 8:e70837. PMID: 23967117

Church, DM, Goodstadt, L, Hillier, LW, Zody, MC, Goldstein, S, She, X, Bult, CJ, Agarwala, R, Cherry, JL, DiCuccio, M. 2009. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biology. 7:e1000112. PMID: 19468303

Cordero, OX, Ventouras, L, DeLong, EF, Polz, MF. 2012. Public good dynamics drive evolution of iron acquisition strategies in natural bacterioplankton populations. Proceedings of the National Academy of Sciences. 109:20059-20064. PMID: 23169633

Doolittle, WF, Zhaxybayeva, O. 2009. On the origin of prokaryotic species. Genome Res. 19:744-756.

Edgar, RC. 2017. Updating the 97% identity threshold for 16S ribosomal RNA OTUs. bioRxiv. 192211. PMID: 29506021

Finotello, F, Mastrorilli, E, Di Camillo, B. 2016. Measuring the diversity of the human microbiota with targeted next-generation sequencing. Briefings in Bioinformatics. PMID: 28025179

Gaudet, AD, Ramer, LM, Nakonechny, J, Cragg, JJ, Ramer, MS. 2010. Small-group learning in an upper-level university biology class enhances academic performance and student attitudes toward group work. PloS One. 5:e15821. PMID: 21209910

Gevers, D, Cohan, FM, Lawrence, JG, Spratt, BG, Coenye, T, Feil, EJ, Stackebrandt, E, Van de Peer, Y, Vandamme, P, Thompson, FL. 2005. Re-evaluating prokaryotic species. Nature Reviews Microbiology. 3:733. PMID: 16138101

Giovannoni, SJ. 2012. Vitamins in the sea. Proceedings of the National Academy of Sciences. 109:13888-13889. PMID: 22891350

Gogarten, JP, Doolittle, WF, Lawrence, JG. 2002. Prokaryotic evolution in light of gene transfer. Mol. Biol. Evol. 19:2226-2238. PMID: 12446813

Hallam, SJ, Torres-Beltran, M, Hawley, AK. 2017. Monitoring microbial responses to ocean deoxygenation in a model oxygen minimum zone. Scientific Data. 4:16. PMID: 29087370

Hawley, AK, Torres-Beltran, M, Zaikova, E, Walsh, DA, Mueller, A, Scofield, M, Kheirandish, S, Payne, C, Pakhomova, L, Bhatia, M. 2017. A compendium of multi-omic sequence information from the Saanich Inlet water column. Scientific Data. 4:170160. PMID: 29087368

Igarashi, N, Harada, J, Nagashima, S, Matsuura, K, Shimada, K, Nagashima, KV. 2001. Horizontal transfer of the photosynthesis gene cluster and operon rearrangement in purple bacteria. J. Mol. Evol. 52:333-341. PMID: 11343129

Ito, T, Minobe, S, Long, MC, Deutsch, C. 2017. Upper ocean O2 trends: 1958-2015. Geophys. Res. Lett. 44:4214-4223. https://doi.org/10.1002/2017GL073613

Janda, JM, Abbott, SL. 2002. Bacterial identification for publication: when is enough enough? J. Clin. Microbiol. 40:1887-1891. PMID: 12037039

Kunin, V, Engelbrektson, A, Ochman, H, Hugenholtz, P. 2010. Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environ. Microbiol. 12:118-123. PMID: 19725865

Lawrence, JG. 2002. Gene transfer in bacteria: speciation without species? Theor. Popul. Biol. 61:449-460. PMID: 12167364

Loman, NJ, Misra, RV, Dallman, TJ, Constantinidou, C, Gharbia, SE, Wain, J, Pallen, MJ. 2012. Performance comparison of benchtop high-throughput sequencing platforms. Nat. Biotechnol. 30:434. PMID: 22522955

Lundin, D, Severin, I, Logue, JB, Östman, Ö, Andersson, AF, Lindström, ES. 2012. Which sequencing depth is sufficient to describe patterns in bacterial α- and β-diversity? Environmental Microbiology Reports. 4:367-372. PMID: 23760801

Mulkidjanian, AY, Koonin, EV, Makarova, KS, Mekhedov, SL, Sorokin, A, Wolf, YI, Dufresne, A, Partensky, F, Burd, H, Kaznadzey, D. 2006. The cyanobacterial genome core and the origin of photosynthesis. Proceedings of the National Academy of Sciences. 103:13126-13131. PMID: 16924101

Nagashima, KV, Hiraishi, A, Shimada, K, Matsuura, K. 1997. Horizontal transfer of genes coding for the photosynthetic reaction centers of purple bacteria. J. Mol. Evol. 45:131-136. PMID: 9236272

Schloss, PD, Westcott, SL, Ryabin, T, Hall, JR, Hartmann, M, Hollister, EB, Lesniewski, RA, Oakley, BB, Parks, DH, Robinson, CJ. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 75:7537-7541. PMID: 19801464

Sogin, ML, Morrison, HG, Huber, JA, Welch, DM, Huse, SM, Neal, PR, Arrieta, JM, Herndl, GJ. 2006. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proceedings of the National Academy of Sciences. 103:12115-12120. PMID: 16880384

Tamames, J, Moya, A. 2008. Estimating the extent of horizontal gene transfer in metagenomic sequences. BMC Genomics. 9:136. PMID: 18366724

Thompson, JR, Pacocha, S, Pharino, C, Klepac-Ceraj, V, Hunt, DE, Benoit, J, Sarma-Rupavtarm, R, Distel, DL, Polz, MF. 2005. Genotypic diversity within a natural coastal bacterioplankton population. Science. 307:1311-1313. PMID: 15731455

Tian, R, Cai, L, Zhang, W, Cao, H, Qian, P. 2015. Rare events of intragenus and intraspecies horizontal transfer of the 16S rRNA gene. Genome Biology and Evolution. 7:2310-2320. PMID: 26220935

Torres-Beltran, M, Hawley, AK, Capelle, D, Zaikova, E, Walsh, DA, Mueller, A, Scofield, M, Payne, C, Pakhomova, L, Kheirandish, S. 2017. A compendium of geochemical information from the Saanich Inlet water column. Scientific Data. 4:170159. PMID: 29087371

Welch, RA, Burland, V, Plunkett, G, Redford, P, Roesch, P, Rasko, D, Buckles, EL, Liou, S, Boutin, A, Hackett, J. 2002. Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proceedings of the National Academy of Sciences. 99:17020-17024. PMID: 12471157

Module 04

Please see portfolio for project 1&2